One of the most tantalizing results from the WMAP experiment is the suggestion that the power
at large scales is anomalously low when compared to the prediction of the "standard" $\Lambda$CDM model.
The same anomaly, although with somewhat larger uncertainty, was also previously noted in the
COBE data. In this work we discuss possible alternate models that give better fits on large scales
and apply a model-comparison technique to select amongst them. We find that models with a cutoff
in the power spectrum at large scales are indeed preferred by the data, but only by a factor of
3.6, at most, in the likelihood ratio, corresponding to about "$1.6\sigma$" if interpreted in the traditional
manner. Using the same technique, we have also examined the possibility of a systematic error in
the measurement or prediction of the large-scale power. Ignoring other evidence that the large-scale
modes are properly measured and predicted, we find this possibility somewhat more likely, with
roughly $2.5\sigma$ evidence.

PACS numbers: 98.80.Cq 



I. INTRODUCTION 

The recent WMAP results have provided a spectacular view of the early Universe. One of the most
intriguing results offered by the WMAP team is that the CMB anisotropy power on the largest angular
scales seems to be anomalously low. In fact, the WMAP team report that this result has a high
statistical significance, quoting a probability ranging from just under 1% to $2 \times 10^{-3}$ for
such a result, depending on the details of the analysis. This low power can be seen in two
complementary ways. First, in the CMB power spectrum, $C_\ell$, the quadrupole ($\ell = 2$) and
octopole ($\ell = 3$) both seem low in comparison to the smooth "best fit" model, as shown in
Figure 1. The latter is selected from the array of models with a flat geometry and nearly
scale-invariant, adiabatic primordial fluctuations.
FIG. 1: The CMB power spectrum at low $\ell$ as measured by
WMAP. The solid line is the best fit using the "standard"
power law $\Lambda$CDM model. Note that the error bars at low
multipoles are almost entirely due to cosmic variance.
FIG. 2: The correlation function, $C(\theta)$, as computed by the
WMAP team from the pixelized map (solid line); using the
$C_\ell$s measured by WMAP (long dashed line); using WMAP's
best-fit $C_\ell$ (short dashed); using the WMAP data with $C_2$ and
$C_3$ changed to equal those of the best fit (dotted); and using
the best-fit $C_\ell$s with lowered values of $C_2$ and $C_3$ (dot-dash).



The low power seems particularly striking when the CMB anisotropy correlation function,

$$C(\theta) = \langle T(\hat{n})\, T(\hat{m}) \rangle \quad \text{with} \quad \hat{n} \cdot \hat{m} = \cos\theta \tag{1}$$

is examined: it is very near zero for $\theta \gtrsim 60^\circ$. Note that the average implied by the
angle brackets has several different, inequivalent interpretations. The WMAP team estimate the
correlation function as the simple average over pixels at a given separation. If we interpret the
average as an ensemble average, however, we can relate the correlation function to the power
spectrum, $C_\ell$:

$$C(\theta) = \sum_\ell \frac{2\ell + 1}{4\pi}\, C_\ell\, P_\ell(\cos\theta) \tag{2}$$
For a Gaussian distribution with enough samples, these two definitions are nearly equivalent,
since the pixel average will approximate the ensemble average. We were able to reproduce the
character of the correlation function from the published angular power spectrum by summing the
Legendre series in Eq. (2). In fact, we obtained almost the same result by using the smooth
best-fit spectrum, but with the quadrupole and octopole lowered to the observed levels, as also
shown in Figure 2. (In fact, the correlation function in this case is actually flatter at
$\theta \sim 180^\circ$ than those computed from the actual data: any of the correlation functions
calculated from real data shows a lower correlation amplitude than those calculated from smooth
power spectra.) Conversely, raising the quadrupole and octopole in the observed spectrum to the
predicted levels removes the anomaly. This exercise implies that the low power is just that: low
power at low $\ell$, due neither to a conspiracy of particular $C_\ell$ values nor to any
non-Gaussian distribution of the multipole moments themselves. Moreover, the apparently striking
difference between the measured and predicted $C(\theta)$ is due entirely to the low values of the
quadrupole and octopole. In this paper, we investigate the statistical significance of these
measurements.
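
The Legendre sum used for this exercise can be sketched numerically. The following is a minimal illustration in Python, assuming the standard relation $C(\theta) = \sum_\ell (2\ell+1)/(4\pi)\, C_\ell P_\ell(\cos\theta)$. The toy spectrum, the function name `correlation_function` and the factors by which the quadrupole and octopole are lowered are hypothetical choices for illustration; they are not the WMAP values.

```python
import numpy as np
from numpy.polynomial import legendre


def correlation_function(cl, theta_deg):
    """Sum C(theta) = sum_l (2l+1)/(4 pi) C_l P_l(cos theta).

    cl[l] is the power at multipole l; cl[0] and cl[1] are normally zero.
    """
    cl = np.asarray(cl, dtype=float)
    ell = np.arange(len(cl))
    # legval evaluates sum_l coeffs[l] * P_l(x), so fold the prefactor into coeffs.
    coeffs = (2 * ell + 1) / (4 * np.pi) * cl
    x = np.cos(np.radians(theta_deg))
    return legendre.legval(x, coeffs)


# Hypothetical toy spectrum with flat l(l+1) C_l for 2 <= l <= 30.
ell = np.arange(31)
cl_smooth = np.zeros(31)
cl_smooth[2:] = 1.0 / (ell[2:] * (ell[2:] + 1))

# Same spectrum with the quadrupole and octopole lowered by hand (illustrative factors).
cl_low = cl_smooth.copy()
cl_low[2] *= 0.2
cl_low[3] *= 0.5

theta = np.linspace(0.0, 180.0, 7)
c_smooth = correlation_function(cl_smooth, theta)
c_low = correlation_function(cl_low, theta)
```

Comparing `c_smooth` and `c_low` at large angles shows, for this toy case, how much of the large-angle correlation is carried by the two lowest multipoles alone.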

In the following, we introduce the Bayesian model comparison method in Section II, discuss models
with low primordial power in Section III and a model of experimental or theoretical systematic
errors in Section IV. We conclude with a discussion in Section V.



II. MODEL COMPARISON 

The question remains, then: how significant is this observed low power? Here, we shall answer
this question using the technique of Bayesian model comparison. This technique has been used
before in various cosmological contexts.

We start, as usual, with Bayes' theorem, which gives the posterior probability of some theoretical
parameters $\theta$ given data $D$ under the hypothesis of some model $m$:

$$P(\theta | D I_m) = P(\theta | I_m)\, \frac{P(D | \theta I_m)}{P(D | I_m)} \tag{3}$$



where $P(A|B)$ gives the probability or probability density of a proposition $A$ given a
proposition $B$ and, here, all probabilities are conditional, at least on the background
information $I_m$, which refers to the background information for a specific model $m$. The model
parameters $\theta$ (the list of which may actually depend on which model $m$ we consider) have
prior probability $P(\theta | I_m)$. The likelihood function is $P(D | \theta I_m)$, and the
so-called "evidence" is

$$P(D | I_m) = \int d\theta\, P(\theta | I_m)\, P(D | \theta I_m) \tag{4}$$

which enforces the normalization condition for the posterior but is also quite properly the
probability of the data given model $m$, the "model likelihood".



We can further factor the evidence as

$$P(D | I_m) = \mathcal{L}_m(\theta_{\max})\, O_m \tag{5}$$

where $\theta_{\max}$ are the parameters that maximize the likelihood for model $m$,
$\mathcal{L}_m(\theta) = P(D | \theta I_m)$, and $O_m$ is the so-called "Ockham factor". The Ockham
factor is essentially the ratio of the posterior probability volume to the prior probability
volume. (This is most easily seen for the case where both prior and posterior are uniform
distributions. When both are Gaussian distributions, the Ockham factor is the ratio of the
determinants of the covariance matrices, which is indeed the ratio of the $1\sigma$ volumes.)
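
As an illustration of this factorization, the following Python sketch evaluates the evidence and the Ockham factor for a one-parameter toy problem. Everything specific in it is an assumption made for illustration: the Gaussian likelihood of width 0.1, the uniform prior centred on the likelihood peak, and the helper names `evidence_and_ockham` and `toy_ockham`. For a prior much wider than the likelihood, the Ockham factor should approach $\sqrt{2\pi}\,\sigma / W$, where $W$ is the prior width.

```python
import numpy as np


def evidence_and_ockham(likelihood, prior, grid):
    """Return (evidence, Ockham factor) for a one-parameter model.

    evidence = integral of prior * likelihood over the grid (trapezoid rule);
    Ockham factor = evidence / maximum likelihood on the grid.
    """
    like = likelihood(grid)
    integrand = prior(grid) * like
    evidence = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(grid))
    return evidence, evidence / like.max()


def toy_ockham(prior_width, sigma=0.1, n=20001):
    """Ockham factor for a Gaussian likelihood under a uniform prior (toy case)."""
    grid = np.linspace(-prior_width / 2, prior_width / 2, n)

    def likelihood(t):
        return np.exp(-0.5 * (t / sigma) ** 2)

    def prior(t):
        return np.full_like(t, 1.0 / prior_width)

    return evidence_and_ockham(likelihood, prior, grid)[1]


# Doubling the prior width halves the Ockham factor when the prior is already wide.
ockham_narrow = toy_ockham(2.0)
ockham_wide = toy_ockham(4.0)
```

In this toy case the maximum likelihood is the same for both priors, so the wider prior is penalized purely through the Ockham factor.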

In order to select among models, one usually employs 
the ratio of their probabilities: 



$$\frac{P(m | D I)}{P(n | D I)} = \frac{P(m | I)}{P(n | I)}\, \frac{P(D | I_m)}{P(D | I_n)} \equiv \frac{P(m | I)}{P(n | I)}\, B_{mn} \tag{6}$$



Any experimental information is contained in the ratio of the evidence, $B_{mn}$, which is
referred to as the "Bayes factor". Lacking any prior information preferring one model over the
other, Eq. (6) only depends on the Bayes factor. Eqs. (4) and (5) imply that the Bayes factor
incorporates the essence of the Ockham razor: since the evidence is an average of the likelihood
function with respect to the prior on the parameters, simpler models having a more compact
parameter space are favored, unless more complicated models fit the data significantly better.
Bayes factors are likelihood ratios, and can be interpreted roughly as follows. If
$1 < B_{mn} \lesssim 3$, there is evidence in favor of model $m$ when compared with $n$, but it is
almost insignificant. If $3 \lesssim B_{mn} \lesssim 20$, the evidence for $m$ is definite, but not
strong. Finally, if $20 \lesssim B_{mn} \lesssim 150$, this evidence is strong, and for
$B_{mn} \gtrsim 150$ it is very strong.

We can also interpret the likelihood ratio in the same manner as we compute the "number of sigma"
by which a value or hypothesis is favored. In this case the model is favored by $\nu\sigma$ with
$\nu \simeq \sqrt{2 |\ln B_{mn}|}$. Another useful interpretation, perhaps more familiar to the
engineering community, would be to use decibels, $10 \log_{10} B_{mn}$ [3].
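
These conversions are simple enough to sketch in a few lines of Python. The function names below are hypothetical, the decibel conversion assumes the conventional definition $10 \log_{10} B$, and the verbal categories follow the rough thresholds (3, 20, 150) quoted above; for $B < 1$ the same scale is applied to $1/B$, which is an assumption of this sketch.

```python
import math


def sigma_equivalent(bayes_factor):
    """'Number of sigma' corresponding to a likelihood ratio: sqrt(2 |ln B|)."""
    return math.sqrt(2.0 * abs(math.log(bayes_factor)))


def decibels(bayes_factor):
    """Bayes factor expressed in decibels, assuming the usual 10 log10(B)."""
    return 10.0 * math.log10(bayes_factor)


def evidence_strength(bayes_factor):
    """Rough verbal scale; B < 1 is treated as 1/B in favor of the other model."""
    b = max(bayes_factor, 1.0 / bayes_factor)
    if b <= 3:
        return "almost insignificant"
    if b <= 20:
        return "definite, but not strong"
    if b <= 150:
        return "strong"
    return "very strong"


# Bayes factors quoted in the text for the cutoff and closed models.
for b in (2.66, 3.62):
    print(b, round(sigma_equivalent(b), 2), evidence_strength(b))
```

Applied to the Bayes factors of 2.66 and 3.62 quoted in this paper, `sigma_equivalent` gives about 1.40 and 1.60 respectively.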

The model comparison formalism outlined here re- 
quires us to specify alternatives to the "fiducial" stan- 
dard model. Thus a sharper version of our question 
might be: is it more probable that the data do reflect 
a standard Big Bang, with nearly-scale invariant, adia- 
batic, isotropic, Gaussian fluctuations, or do they come 
from a Universe with, say, a cutoff in the power spec- 
trum? Or could there be a problem in the data analysis 
so that, say, the error bars are larger than thought, or 
the reported results somehow exhibit an over-subtraction 
of large-scale power? In the following we shall examine 
these possibilities. 

The "fiducial" standard model is the WMAP team's best-fit model. It is a flat $\Lambda$CDM
Friedmann-Robertson-Walker universe, with baryon density $\Omega_b = 0.046$ and "dark energy"
density $\Omega_\Lambda = 0.73$ (in units of the FRW critical density). It has a power-law initial
matter power spectrum with spectral index $n_s = 0.99$ and a present-day expansion rate of
$H_0 = 100h$ km s$^{-1}$ Mpc$^{-1}$ with $h = 0.72$. The power spectrum amplitude is $A_s = 0.855$,
as defined in the CMBFast program [10] and as used by the WMAP team [11], related to the amplitude
of fluctuations at $k_0 = 0.05$ Mpc$^{-1}$.

The evidence for this model is simply the likelihood $P(D | \theta I_{\rm fiducial})$ evaluated at
the best-fit values of the parameters. We calculate the likelihood using the code provided by the
WMAP team [11], which correctly accounts for correlations between values of $\ell$ and the
non-Gaussian shape of the distribution. For the fiducial model it is equal to 0.00094, which is
the value that we will need when comparing to other models.



III. LOW-POWER MODELS 

A. A flat Universe with a cutoff in the primordial 
spectrum 

The most obvious way to lower the CMB power spectrum is to lower the power in the primordial
density power spectrum $P(k)$. Since the CMB is the product of small fluctuations in the
primordial plasma, we can use linear theory. To each multipole $\ell$ there corresponds a transfer
function $T_\ell(k)$, such that $\ell(\ell + 1) C_\ell = 2\pi \int d\ln k\, T_\ell(k)\, k^3 P(k)$.
The transfer function depends on the cosmological parameters, but is peaked at approximately
$k \eta_0 \approx \ell$, where $\eta_0$ is the current size of the universe, of order
$\eta_0 \sim 1.5 \times 10^4$ Mpc. Lowering power at $k < 6 \times 10^{-4}$ Mpc$^{-1}$ thus lowers
the CMB power spectrum for $\ell \lesssim 4$.

A simple model where such a cutoff was imposed by fiat was considered by Contaldi et al. They used
the following form for the primordial spectrum:

$$P(k) = P_0(k) \left[ 1 - e^{-(k/k_c)^\alpha} \right] \tag{7}$$



where $P_0(k) \sim A k^{n_s}$ is the usual power-law primordial spectrum. They rightly determine
that the data favor a cutoff at $k_c \sim (5$-$6) \times 10^{-4}$ Mpc$^{-1}$. Contaldi et al. also
considered another class of models with the cutoff produced by altering the shape of the inflaton
potential. Here, we concentrate on the lower multipoles alone and consider the effect of varying
only the location of the power cutoff using Eq. (7) with $\alpha = 1.8$. This reasonably assumes
that there is enough freedom in the model space to allow the high-$\ell$ spectra to adjust to fit
the data, and that the transfer function, $T_\ell(k)$, does not change much at low $\ell$.
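
The exponential cutoff can be written down directly. The sketch below is a minimal Python illustration assuming the form $P(k) = P_0(k)\,[1 - e^{-(k/k_c)^\alpha}]$ with $\alpha = 1.8$ and a pure power law for $P_0(k)$; the function names, the default amplitude and the default spectral index are hypothetical placeholders rather than values taken from any fitting code.

```python
import numpy as np


def cutoff_suppression(k, k_c, alpha=1.8):
    """Suppression factor 1 - exp(-(k/k_c)**alpha); requires k_c > 0."""
    k = np.asarray(k, dtype=float)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for small x.
    return -np.expm1(-((k / k_c) ** alpha))


def cutoff_spectrum(k, k_c, alpha=1.8, amplitude=1.0, n_s=0.99):
    """Power-law spectrum amplitude * k**n_s multiplied by the cutoff factor."""
    k = np.asarray(k, dtype=float)
    return amplitude * k ** n_s * cutoff_suppression(k, k_c, alpha)


# Wavenumbers in Mpc^-1; k_c chosen in the range discussed in the text.
k = np.array([5e-5, 5e-4, 5e-2])
factors = cutoff_suppression(k, k_c=5e-4)
```

For $k \gg k_c$ the factor tends to one, so the spectrum is unchanged at small scales, while a decade below $k_c$ the power is suppressed to a few percent of the power-law value.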

In Figure 3 we show the CMB power spectrum at low multipoles with several cutoff models, and in
Figure 4 we show the CMB likelihood as a function of the cutoff scale, $k_c$. These figures
essentially reproduce the results of Contaldi et al.
FIG. 3: CMB power spectra for various values of the cutoff
parameter $k_c$ of Eq. (7), measured in units of $10^{-4}$ Mpc$^{-1}$.

FIG. 4: The likelihood as a function of the cutoff wavenumber
$k_c$ for the model of Section III A.



It is clear that the cutoff allows for a better fit than the so-called best fit. Next we evaluate
the evidence for this model with $k_c$ as the only parameter, with the prior
$p(k_c) = P(k_c | {\rm cutoff})$ chosen to be flat in the region $[0, 0.001]$ Mpc$^{-1}$. We obtain

$$P(D | {\rm cutoff}) = \int dk_c\, p(k_c)\, \mathcal{L}(k_c) = 0.0025 . \tag{8}$$

This value is about 2.66 times the evidence for the fiducial model, which implies that the cutoff
model is preferred only at approximately the $1.4\sigma$ level. We have also calculated the Ockham
factor for this model, defined in Eq. (5), to be 0.441.

FIG. 5: The CMB power spectrum for different curvature
values in the closed model of Section III B.

FIG. 6: The likelihood as a function of $H_0$ for the closed
model of Section III B.

Choosing a flat prior over this region emphasizes values of the cutoff near
$k_c \approx 0.5 \times 10^{-3}$ Mpc$^{-1}$, so in fact implements a sort of fine tuning. We might
instead use a prior proportional to $1/k_c$ (i.e., uniform in $\ln k_c$), which also has the
advantage of having the same form if we switch variables to the cutoff length $l_c \propto 1/k_c$.
If we choose a lower limit of $10^{-4}$ Mpc$^{-1}$, the evidence is unchanged from the value for
the flat prior, but as we decrease the lower limit the evidence becomes dominated by the plateau
at $k_c \to 0$, where the models approach the fiducial best fit. The limiting value of the
evidence is thus the same value as for the fiducial model itself: the maximum likelihood for this
model may be quite large, but the Ockham factor is small.
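
The dependence on the lower limit of the prior can be illustrated with a toy likelihood. In the Python sketch below, the likelihood curve is a hypothetical stand-in (a plateau at the fiducial value 0.00094 plus a Gaussian bump); its peak height, position and width are assumptions made for illustration and are not the WMAP likelihood of Figure 4. Only the qualitative behaviour is meant to carry over: with a prior uniform in $\ln k_c$, pushing the lower limit towards zero drives the evidence to the plateau value.

```python
import numpy as np

L_FID = 0.00094    # fiducial likelihood quoted in the text (plateau as k_c -> 0)
L_PEAK = 0.0057    # hypothetical peak height of the toy curve
K_PEAK = 5e-4      # hypothetical peak position, Mpc^-1
K_WIDTH = 1e-4     # hypothetical peak width, Mpc^-1
K_MAX = 1e-3       # upper edge of the prior range, Mpc^-1


def toy_likelihood(k_c):
    """Plateau at L_FID plus a Gaussian bump; a stand-in, not real data."""
    bump = np.exp(-0.5 * ((k_c - K_PEAK) / K_WIDTH) ** 2)
    return L_FID + (L_PEAK - L_FID) * bump


def trapezoid(y, x):
    """Plain trapezoid rule, written out to avoid NumPy version differences."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def evidence_flat_prior(n=200001):
    """Evidence for a prior uniform in k_c on [0, K_MAX]."""
    k = np.linspace(0.0, K_MAX, n)
    return trapezoid(toy_likelihood(k), k) / K_MAX


def evidence_log_prior(k_min, n=200001):
    """Evidence for a prior uniform in ln k_c on [k_min, K_MAX]."""
    u = np.linspace(np.log(k_min), np.log(K_MAX), n)
    return trapezoid(toy_likelihood(np.exp(u)), u) / (u[-1] - u[0])


for k_min in (1e-4, 1e-8, 1e-40):
    print(k_min, evidence_log_prior(k_min))
```

As `k_min` decreases, the printed evidence falls monotonically towards `L_FID`, which is the behaviour described above.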



B. Geometry: A Closed Universe 

CMB measurements indicate that the geometry of the 
universe is very nearly flat. This is consistent with the 
inflationary paradigm in which the universe, unless additionally fine-tuned, would be expected to
be infinitesimally close to flat today. However, a slightly closed universe is also consistent
with the current data and is actually marginally preferred by the WMAP experiment, whose best-fit
value was $\Omega_k = -0.02 \pm 0.02$.

When calculating theoretical predictions for CMB anisotropy spectra one is faced with the
so-called geometric degeneracy among the values of matter density, curvature and dark energy
density. Given fixed values for $\Omega_{\rm cdm} h^2$, $\Omega_b h^2$ and the acoustic peak
location parameter, one can produce almost identical CMB spectra by choosing the values of $h$ and
$\Omega_k$ along a degeneracy line in the $(h, \Omega_k)$ space. The differences between spectra
are only notable on large scales ($\ell \lesssim 20$), where the ISW contribution to the
anisotropy due to the dark energy component is dominant.

A closed universe contains a characteristic scale: the curvature scale $R_c$. The eigenvalues
$\beta$ of the Laplacian are, therefore, discrete and related to the physical wavenumber $k$ via
$\beta^2 = 1 + k^2 R_c^2$, with modes corresponding to $\beta = 1$ and 2 being unphysical pure
gauge modes. As has been argued, if the universe were indeed marginally closed, then in the
absence of a concrete model it is not obvious how the concept of scale invariance should be
extended to scales comparable to the curvature scale. One possibility is that the spectrum
truncates on scales close to $R_c$. A heuristic formula for the primordial spectrum, illustrating
such a possibility, was suggested in [13]:

We have used Eq. (9) to generate CMB anisotropy spectra for various values of $\Omega_k$ chosen to
lie along the same geometrical degeneracy line that contains WMAP's best-fit flat $\Lambda$CDM
model. The results are shown in Figure 5.

As can be seen from the plot, the truncated closed 
models fit the data considerably better than WMAP's 
best fit model. 

Next we calculate the evidence for this model with $h$ as the free parameter. The prior $p(h)$ was
taken to be a Gaussian with the mean $h = 0.72$ and variance $\sigma_h = 0.10$, and additionally
constrained to be in the range [0.52, 0.72]. The lower bound is dictated by current experimental
constraints on the value of $h$, while the upper bound follows from the fact that along the
geometric degeneracy line higher values of $h$ would correspond to $\Omega_k > 0$. We find that
the evidence for this model is

$$P(D | {\rm closed}) = \int dh\, p(h)\, \mathcal{L}(h) = 0.0034 , \tag{10}$$

where $\mathcal{L}(h)$ is the likelihood of the data given a particular value of $h$. The obtained
evidence is approximately 3.6 times that of WMAP's best-fit model. This can be interpreted as the
closed model being preferred over the best-fit model at a $1.6\sigma$ level, which, considering the
absence of a robust model of a marginally closed universe, is insufficient to warrant abandoning
simple inflation as the base model for fitting data. The Ockham factor for this model (Eq. (5)) is
0.370.

FIG. 7: Likelihood contours in the $(n, h)$ parameter space
for the closed model of Section III B, marginalized over the
value of $\sigma_8$. Shown are the $1\sigma$ and $2\sigma$ contours, defined by the
equivalent likelihood ratio for a two-parameter Gaussian
distribution. The point that maximizes the likelihood function
is marked with an asterisk (*).

In addition, we have considered the same closed universe model but with the spectral index and the
value of $\sigma_8$ also allowed to vary, to see if the fit could be improved further. The prior on
$n_s$ was chosen to be Gaussian with $n = 0.97$ and $\sigma_n = 0.07$ and restricted to the
interval [0.83, 1.11]. The prior on $\sigma_8$ was also Gaussian, with the mean value of 0.95 and
variance 0.05, restricted to the range [0.6, 1]. We found the evidence in this case to be

$$P(D | {\rm closed}) = \int dn\, dh\, d\sigma_8\, p(n)\, p(h)\, p(\sigma_8)\, \mathcal{L}(n, h, \sigma_8) = 0.0008 , \tag{11}$$

which is lower than the evidence for the fiducial model. The likelihood contours for this model,
after marginalizing over $\sigma_8$, are shown in Figure 7. This illustrates how adding more
parameter freedom can dramatically dilute the evidence for the model, even if it fits the data
very well. This is reflected in a very low value of the Ockham factor for this model, which is
only 0.069.
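
The dilution can be made explicit with the numbers already quoted. Since the evidence factors into the maximum likelihood times the Ockham factor, dividing each quoted evidence by its Ockham factor gives the implied maximum likelihood. The short Python sketch below does this arithmetic; the dictionary layout and function name are hypothetical, and the implied values are derived here rather than quoted from the analysis.

```python
FIDUCIAL_LIKELIHOOD = 0.00094

# (evidence, Ockham factor) as quoted in the text for each model.
MODELS = {
    "flat with cutoff": (0.0025, 0.441),
    "closed (h)": (0.0034, 0.370),
    "closed (h, sigma_8, n)": (0.0008, 0.069),
}


def implied_max_likelihood(evidence, ockham_factor):
    """Invert evidence = L_max * Ockham factor for L_max."""
    return evidence / ockham_factor


implied = {name: implied_max_likelihood(*vals) for name, vals in MODELS.items()}
bayes = {name: vals[0] / FIDUCIAL_LIKELIHOOD for name, vals in MODELS.items()}

for name in MODELS:
    print(f"{name}: L_max ~ {implied[name]:.4f}, Bayes factor ~ {bayes[name]:.2f}")
```

On these numbers the three-parameter closed model has the highest implied maximum likelihood of the three, yet the lowest evidence, which is the dilution effect described above.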



IV. THEORETICAL AND EXPERIMENTAL SYSTEMATICS

Having examined the possibility that the observed lack of power on large scales points in the
direction of new physics, we now turn to the alternative that it can be attributed to data
analysis methodology. The simplest case would be an underestimation of the errors corresponding to
low multipoles. This would mean that we live in a universe described by the best-fit power law
model and that the discrepancy between its predictions and the WMAP data emanates from our
miscalculating the aforementioned errors. Of course, we have copious evidence from the work done
by the WMAP team itself, as well as from comparison with other data, that their data are likely to
be reliable on these scales. Conversely, we could instead interpret this as saying that the
$\ell = 2, 3$ multipoles are correctly measured, but have an unknown origin outside the standard
cosmology. That is, there is some model like those considered in the previous sections, but we do
not yet know what it is.

We implement this idea by multiplying the diagonal elements of the curvature matrix corresponding
to $C_2$ and $C_3$ by two constants (hereafter referred to as $r_2$ and $r_3$) that serve as the
free parameters of our model. This has the effect of increasing the error bars of $C_2$ and $C_3$.
Figure 8 shows contours of the likelihood function for various values of these parameters. We have
also evaluated the evidence for this model to be

$$P(D | {\rm syst.}) = \int dr_2\, dr_3\, p(r_2)\, p(r_3)\, \mathcal{L}(r_2, r_3) = 0.0387 \tag{12}$$

using flat priors on $r_2$ and $r_3$ in the intervals [1, 200] and [1, 150] respectively; these
maxima are chosen for numerical convenience, but the results are insensitive to them as long as
they are $\gg 1$. The result is also insensitive to whether we use a uniform prior on the $r_i$ or
on $\ln r_i$. The latter is equivalent to $P(r_i) \propto 1/r_i$, the so-called "Jeffreys prior"
appropriate for a scale parameter.

FIG. 8: Contours of the likelihood as a function of the
parameters $r_2$ and $r_3$. Shown are the $1\sigma$ and $2\sigma$ contours. The
likelihood is maximized in the upper right corner, where $r_2$
and $r_3$ are largest.

Note that the likelihood is maximized when these parameters reach their largest values: the data
always become more likely when the error bars increase. Indeed, this implies that we can consider
an even simpler model with parameters fixed at $r_i \to \infty$. This model has a likelihood of
0.0414, giving it a Bayes factor of 44 with respect to the conventional best fit. This model
corresponds to ignoring the data at $\ell = 2, 3$: there is no model that can improve the fit here
by more than this roughly $2.75\sigma$ level.

The evidence for these models implies that if the correct model at low $\ell$ were indeed other
than the "best fit", there would be roughly $2.75\sigma$ evidence that the error bars on $C_2$ and
$C_3$ were underestimated.
V. DISCUSSION



Model                      Ockham factor   Bayes factor   $\sigma$
Best fit                                   1
Flat with cutoff           0.441           2.66           1.40 $\sigma$
Closed ($h$)               0.370           3.62           1.60 $\sigma$
Closed ($h$, $\sigma_8$, $n$)  0.069       0.85           0.57 $\sigma$
Large error bars           0.945           41.2           2.73 $\sigma$

TABLE I. Summary of the results of the paper. The Bayes
factors, $B$, are all defined with respect to the "Best fit" model
of the first row, and the column "$\sigma$" is defined as $\sqrt{2 |\ln B|}$.
The Ockham factors are defined in the text, Section II.

We summarize our results in Table I, presenting the Bayes and Ockham factors for the models we
have discussed. Note that these numbers explicitly do not consider prior information about these
models. Indeed, all of these models were explicitly constructed in response to the observed low
power. In particular, the models with low primordial power considered in Section III require that
the scale of the power cutoff be fine tuned with respect to the horizon scale in order to reduce
power at just the right angular scale, either by fiat or by determining the location of the
curvature scale. A priori, such models would be strongly disfavored. However, it has been recently
pointed out that a cross-correlation between CMB and cosmic-shear patterns, as well as between CMB
and low-redshift tracers of the mass distribution, can provide supplemental evidence for a
large-scale cutoff in the primordial spectrum. Such a cutoff would generally increase the
cross-correlation.

There are models with similar characteristics that 
have been discussed separately from these low-power issues: the class of models with non-trivial
topologies. We might assign a greater prior to such models, although again explaining the
observations requires fine tuning of the topology scale. In a recent paper, de Oliveira-Costa et
al. [23] argued that the low power on large scales is unlikely to be a sign of non-trivial
topology. We did not include these models in our analysis; however, one can expect them to have
similar evidence to the cutoff models we have considered. Indeed, the types of CMB spectra that
these two models produce are essentially the same, and the difference in the values of the
evidence comes mainly from the



prior on the free parameter. Note that models with non- 
trivial topology will also have other signatures, possibly 
observable in the CMB by considering properties beyond 
the power spectrum (see e.g., psj and references therein). 

Other analyses of these data have reached similar conclusions. Gaztañaga et al. performed a full
covariance analysis of the WMAP data using the 2-point angular correlation and its higher-order
moments. They have argued that the WMAP data are in reasonable agreement with the $\Lambda$CDM
model if the WMAP data are considered as a particular realization of realistic $\Lambda$CDM
simulations with the corresponding covariance.

We have also considered a model that allows for a possible systematic error in the determination
of the large-scale power. Although this model is experimentally unlikely, we can instead consider
it as the reductio ad absurdum of all the possibilities we are considering: what happens if we
just throw away the large-scale data? From the Bayes factor of about 44 we see that there is
likely no model at all that will ever improve the fit to the large scales by more than about
$2.75\sigma$, in agreement with the somewhat different analysis of [23], and to some extent with
that of the WMAP team itself [1, 13]. It is worth noting that the phases of low harmonics could
provide additional information about the plausibility of a cosmological model; for instance, a
model predicting an alignment of the $\ell = 2, 3$ harmonics would be favoured with respect to a
model making no such prediction, given that both models had the same power at low $\ell$. But we
should point out that features like the alignment of the low harmonics would not have any impact
on the power at large scales. Consequently, no model will ever fare better than about
$2.75\sigma$ as far as power at large scales is concerned.

However, there are other possibilities for probing the 
physics on the largest scales. In particular, a better mea- 
surement of the polarization of the CMB and its corre- 
lation with the intensity at these same multipoles will 
certainly enable us to cement the interpretation of the 
anisotropy at the same scales. 